%%::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
%%
%%    This file is part of ICTP RegCM.
%%    
%%    Use of this source code is governed by an MIT-style license that can
%%    be found in the LICENSE file or at
%%
%%         https://opensource.org/licenses/MIT.
%%
%%    ICTP RegCM is distributed in the hope that it will be useful,
%%    but WITHOUT ANY WARRANTY; without even the implied warranty of
%%    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
%%
%%::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

\label{obtaindata}

The first step in running a test simulation is to obtain the static data needed
to localize the model DOMAIN, together with the global atmosphere and ocean
model datasets used to build the initial and boundary conditions (ICBC) for a
limited area simulation.

ICTP maintains a publicly accessible web repository of datasets at:

{\bf http://clima-dods.ictp.it/regcm4 }

In the following, we will replace this URL with a shell variable:

\begin{Verbatim}
$> export ICTP_DATASITE=http://clima-dods.ictp.it/regcm4
\end{Verbatim}

At present, you must download the required global data to your local disk
storage before attempting any run.

Here we show how to use the command line download tools \verb=curl= and
\verb=wget= to get the data.
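
Before starting, it can be worth checking that both tools are installed.
The following is a minimal sketch, assuming a POSIX shell; the function name
\verb=have_tool= is chosen here for illustration and is not part of RegCM:

\begin{Verbatim}
# Succeed if the command named in $1 can be found by the shell.
have_tool() {
    command -v "$1" > /dev/null 2>&1
}

# Report the availability of the two download tools used in this chapter.
for tool in curl wget; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: MISSING"
    fi
done
\end{Verbatim}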

\section{Global dataset directory Layout}

We suggest that you establish a convenient location for the global datasets
on your local storage. Keep in mind that the space required for one year of
GLOBAL data at the original resolution can be as large as 250 GBytes!

With this in mind, we will assume from now on that you have identified on your
system, or have network access to, a storage resource able to hold, say,
100 GB of data, and that it is reachable on your system under the
\verb=$REGCM_GLOBEDAT= location.
In this directory, create the following subdirectories:

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> mkdir SURFACE CLM45 CMIP6 ERA5
\end{Verbatim}
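
If the commands above are likely to be repeated, for example on a second
machine, the same layout can be created by a small script. This is only a
sketch: the function name \verb=make_globedat_layout= is hypothetical, and
the \verb=-h= option of \verb=df= is assumed to be available (it is on GNU
and BSD systems).

\begin{Verbatim}
# Create the top-level layout under the directory given as $1.
# mkdir -p does not fail if a directory already exists.
make_globedat_layout() {
    base="$1"
    for dir in SURFACE CLM45 CMIP6 ERA5; do
        mkdir -p "$base/$dir" || return 1
    done
}

if [ -n "${REGCM_GLOBEDAT:-}" ]; then
    # Build the layout, then show the free space on that file system.
    make_globedat_layout "$REGCM_GLOBEDAT" && df -h "$REGCM_GLOBEDAT"
else
    echo "REGCM_GLOBEDAT is not set" >&2
fi
\end{Verbatim}

Because \verb=mkdir -p= is used, running the script twice is harmless.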

This does not cover all the possible global data source paths, but it is enough
for running the model to test its capabilities.

\section{Static Surface Dataset}

The model needs to be localized on a particular DOMAIN. The information needed
is topography, land type classification, and optionally lake depth (to run the
Hostetler lake model) and soil texture classification.

This means downloading four files, which are global archives of the above data
at 30 arc-second horizontal resolution on a global latitude-longitude grid.

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> cd SURFACE
$> curl -o GTOPO_DEM_30s.nc ${ICTP_DATASITE}/SURFACE/GTOPO_DEM_30s.nc
$> curl -o GLCC_BATS_30s.nc ${ICTP_DATASITE}/SURFACE/GLCC_BATS_30s.nc
$> curl -o GLZB_SOIL_30s.nc ${ICTP_DATASITE}/SURFACE/GLZB_SOIL_30s.nc
\end{Verbatim}

Optional Lake dataset:

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> cd SURFACE
$> curl -o ETOPO_BTM_30s.nc ${ICTP_DATASITE}/SURFACE/ETOPO_BTM_30s.nc
\end{Verbatim}
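
By default \verb=curl= writes whatever body the server returns, so a failed
request can leave an HTML error page saved under the \verb=.nc= name. A quick
sanity check is to look at the first four bytes of each file: NetCDF classic
files start with the characters \verb=CDF=, and NetCDF-4 files carry the HDF5
signature. The sketch below assumes a POSIX shell with \verb=od=; the function
name \verb=is_netcdf= is ours, and the check does not validate the file
contents beyond the signature.

\begin{Verbatim}
# Succeed if the file in $1 starts with a NetCDF signature:
#   43 44 46 ..  -> "CDF"      (NetCDF classic and 64-bit variants)
#   89 48 44 46  -> "\211HDF"  (NetCDF-4, stored as HDF5)
is_netcdf() {
    magic=$(od -An -tx1 -N 4 "$1" | tr -d ' \n')
    case "$magic" in
        434446*|89484446) return 0 ;;
        *) return 1 ;;
    esac
}

# Check every .nc file in the SURFACE directory, if any is present.
for f in "${REGCM_GLOBEDAT:-.}"/SURFACE/*.nc; do
    [ -f "$f" ] || continue
    if is_netcdf "$f"; then
        echo "$f: NetCDF signature found"
    else
        echo "$f: NOT a NetCDF file, download it again"
    fi
done
\end{Verbatim}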

\section{CMIP6 Dataset}

The data in this directory come from input4MIPs
({\bf https://aims2.llnl.gov/search/input4mips/ }). They are the common
radiative forcings used in the CMIP experiments.

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> cd CMIP6
$> mkdir AEROSOL GHG O3 SOLAR
$> for dir in AEROSOL GHG O3 SOLAR; do cd $dir; \
    wget ${ICTP_DATASITE}/CMIP6/$dir/ -O - | \
    wget -A ".nc" -l1 --no-parent --base=${ICTP_DATASITE}/CMIP6/$dir/ \
      -nd -Fri -; cd ..; done
$> cd $REGCM_GLOBEDAT
\end{Verbatim}

\section{CLM 4.5 Dataset}
\label{clm45data}

To use the \verb=CLM 4.5= option in the model, you need a series of files
with global land surface characteristics datasets.

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> cd CLM45
$> mkdir megan pftdata snicardata surface
$> for dir in megan pftdata snicardata surface; do cd $dir; \
    wget ${ICTP_DATASITE}/CLM45/$dir/ -O - | \
    wget -A ".nc" -l1 --no-parent --base=${ICTP_DATASITE}/CLM45/$dir/ \
      -nd -Fri -; cd ..; done
$> cd $REGCM_GLOBEDAT
\end{Verbatim}

These are the input files for the \verb=mksurfdata= program (see \ref{clm45}).

\section{Atmosphere and SST Global Dataset}

The model needs to build initial and boundary conditions at the regional scale
by interpolating the output of a Global Climate Model (GCM) onto the RegCM grid.
The GCM dataset can come from any of the supported models, but for our test run
we will download just the ERA5 dataset for June 2020
(Jun 01 00:00:00 UTC to Jun 30 18:00:00 UTC). The dataset used in this tutorial
is degraded from the original (0.25x0.25) degree resolution to (1.5x1.5) degrees
to reduce the volume and bandwidth required for the data transfer from
ICTP. The original resolution dataset (36 times larger for the global data) can
be obtained from the Copernicus CDS using the cdsapi (example scripts are
available from ICTP at {\bf http://clima-dods.ictp.it/regcm4/ERA5/progs }).
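
The factor of 36 quoted above follows from the grid spacing ratio, which
applies in both latitude and longitude. As a rough estimate that ignores the
exact number of grid points at the poles:

\[
  \left( \frac{1.5}{0.25} \right)^{2} = 6^{2} = 36
\]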

\begin{Verbatim}
$> cd $REGCM_GLOBEDAT
$> cd ERA5
$> mkdir 2020 SSTD
$> cd 2020
$> for type in geop qhum tatm uwnd vwnd
   do
     curl -o ${type}_2020_06.nc \
        ${ICTP_DATASITE}/ERA5/lowres_1.5/2020/${type}_2020_06.nc
   done
$> cd ../SSTD
$> curl -o sst_2020_06.nc \
      ${ICTP_DATASITE}/ERA5/lowres_1.5/SSTD/sst_2020_06.nc
$> cd $REGCM_GLOBEDAT
\end{Verbatim}
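
Once the downloads have finished, it may help to confirm that all six files
used in this tutorial are present and not empty. The following is a sketch
under the directory layout described above; the function names
\verb=era5_expected_files= and \verb=era5_missing_files= are hypothetical
helpers, not RegCM tools.

\begin{Verbatim}
# List the ERA5 files used in this tutorial, relative to the ERA5 directory.
era5_expected_files() {
    for var in geop qhum tatm uwnd vwnd; do
        echo "2020/${var}_2020_06.nc"
    done
    echo "SSTD/sst_2020_06.nc"
}

# Print the expected files that are missing or empty under the directory
# given as $1. Returns non-zero if at least one file is missing.
era5_missing_files() {
    missing=0
    for f in $(era5_expected_files); do
        if [ ! -s "$1/$f" ]; then
            echo "$f"
            missing=1
        fi
    done
    return $missing
}

if [ -n "${REGCM_GLOBEDAT:-}" ]; then
    era5_missing_files "$REGCM_GLOBEDAT/ERA5" || \
        echo "some ERA5 files are missing" >&2
fi
\end{Verbatim}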

With these datasets we are now ready to go through the RegCM Little Tutorial
in the next chapter of this User Guide.
